ABSTRACT
While loop reordering and fusion affect the constant-factor performance of dense tensor programs, their effects on sparse tensor programs are asymptotic, often leading to orders-of-magnitude performance differences in practice. Sparse tensors also introduce a choice of compressed storage formats that can have asymptotic effects. Research into sparse tensor compilers has led to simplified languages that express these tradeoffs, but the user is expected to provide a schedule that makes the decisions. This is challenging because the schedule writer must anticipate the interaction between sparse formats, loop structure, potential sparsity patterns, and the compiler itself. Automating this decision-making process stands to finally make sparse tensor compilers accessible to end users.
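To make the asymptotic effect of format choice concrete, the following sketch (not from the paper; the matrix and CSR arrays are illustrative) contrasts a dense loop nest for matrix-vector multiplication, which performs O(n·m) work regardless of sparsity, with a loop over a compressed sparse row (CSR) representation, which performs only O(nnz) work:

```python
# Sketch: why storage format choice has asymptotic, not constant-factor, effects.

def spmv_dense(A, x):
    """Dense loop nest: O(n*m) work, visits every entry including zeros."""
    n, m = len(A), len(A[0])
    y = [0.0] * n
    for i in range(n):
        for j in range(m):
            y[i] += A[i][j] * x[j]
    return y

def spmv_csr(pos, crd, val, x, n):
    """CSR traversal: O(nnz) work, visits only stored nonzeros.
    pos[i]:pos[i+1] delimits row i's entries in crd (columns) and val (values)."""
    y = [0.0] * n
    for i in range(n):
        for p in range(pos[i], pos[i + 1]):
            y[i] += val[p] * x[crd[p]]
    return y

# A 3x4 matrix with 3 nonzeros, in both dense and CSR form.
A = [[0, 2, 0, 0],
     [0, 0, 0, 0],
     [5, 0, 0, 1]]
pos, crd, val = [0, 1, 1, 3], [1, 0, 3], [2.0, 5.0, 1.0]
x = [1.0, 1.0, 1.0, 1.0]
assert spmv_dense(A, x) == spmv_csr(pos, crd, val, x, 3)
```

The two loops compute the same result, but as the matrix grows with fixed nonzero count, the dense version's work grows with the iteration space while the CSR version's work stays proportional to nnz.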
We present, to the best of our knowledge, the first automatic asymptotic scheduler for sparse tensor programs. We provide an approach to abstractly represent the asymptotic cost of schedules and to choose between them. We narrow down the search space to a manageably small Pareto frontier of asymptotically non-dominated kernels. We test our approach by compiling these kernels with the TACO sparse tensor compiler and comparing them against those generated with TACO's default schedules. Our results show that our approach reduces the scheduling space by orders of magnitude and that the generated kernels perform asymptotically better than those generated using the default schedules.
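The Pareto-frontier idea above can be sketched in a few lines. This is a hedged illustration, not the paper's actual cost model: costs are hypothetical tuples of asymptotic exponents, and a schedule is kept only if no other schedule is at least as cheap in every component and strictly cheaper in one:

```python
# Sketch: filtering candidate schedules down to the Pareto frontier of
# asymptotically non-dominated cost vectors (hypothetical cost tuples).

def dominates(a, b):
    """True if cost a is <= b in every component and differs in at least one."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_frontier(costs):
    """Keep only the costs that no other cost dominates."""
    return [c for c in costs
            if not any(dominates(d, c) for d in costs if d != c)]

# Hypothetical asymptotic costs of four candidate schedules.
costs = [(1, 2), (2, 1), (2, 2), (3, 3)]
frontier = pareto_frontier(costs)
# (2, 2) is dominated by (1, 2), and (3, 3) by every other schedule,
# so only (1, 2) and (2, 1) survive.
```

Only the surviving, mutually incomparable schedules would then need to be compiled and measured, which is how a frontier like this can shrink the scheduling space by orders of magnitude.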
Index Terms
- Autoscheduling for sparse tensor algebra with an asymptotic cost model